Abstract. We present a detailed description of the methods used to compute the three-dimensional two-point
galaxy correlation function in the VIMOS-VLT Deep Survey (VVDS). We investigate how instrumental selection
effects and observational biases affect the measurements and identify the methods to correct for them. We quantify
the accuracy of our corrections using an ensemble of 50 mock galaxy surveys generated with the GalICS
semi-analytic model of galaxy formation, which incorporate the selection biases and tiling strategy of the real data.
We demonstrate that we are able to recover the real-space two-point correlation function $\xi(s)$ and the projected
correlation function $w_p(r_p)$ to an accuracy better than 10% on scales larger than 1 $h^{-1}$ Mpc with the sampling
strategy used for the first epoch VVDS data. The large number of simulated surveys allows us to provide a reliable
estimate of the cosmic variance on the measurements of the correlation length $r_0$ at $z \sim 1$, of about 15-20% for
the first epoch VVDS observation, while any residual systematic effect in the measurements of $r_0$ is always below
5%. The error estimation and measurement techniques outlined in this paper are being used in several parallel
studies which investigate in detail the clustering properties of galaxies in the VVDS.

Key words. cosmology: deep redshift surveys - large scale structure of Universe - methods: statistical - galaxies:
evolution

1. Introduction

The VIMOS VLT Deep Survey (VVDS, Le Fèvre et al., 2005a) is dedicated to
studying the evolution of galaxies and large scale structure to $z \sim 2$, with a
significant fraction of galaxies reaching $z \sim 4$. The VVDS spectroscopic
survey is performed with the VIMOS spectrograph at the European Southern
Observatory Very Large Telescope and complemented with multi-color BVRI
imaging data obtained at the CFHT telescope (McCracken et al., 2003;
Le Fèvre et al., 2004). The complete survey will consist of four fields of
2° by 2° each, with multi-band photometric coverage in the BVRI (and partly
UJK) bands. Multi-object spectroscopy down to $I_{AB} = 22.5$ is being obtained
over the four fields ("VVDS Wide"), with a deeper area of 1.5 deg$^2$ in the
VVDS-02h field and in the Chandra Deep Field South (VVDS-CDFS) covered to
$I_{AB} = 24$ ("VVDS Deep"). The first epoch VVDS data consist of more than
11000 spectra obtained in the VVDS-Deep fields (Le Fèvre et al., 2005a).

One of the key science goals of the VVDS is to measure the evolution of
galaxy clustering from the present epoch up to $z \sim 2$. The simplest
statistic used for this analysis is the spatial two-point correlation
function $\xi(r)$ and its variants (e.g. Peebles, 1980), i.e. the second
moment of the galaxy distribution. Given the geometry and selection function
of galaxy surveys, however, the practical estimation of $\xi(r)$ from the
actual data is not straightforward. Edge effects, sampling inhomogeneities
and selection effects all introduce different biases that hamper the survey's
ability to estimate the true underlying clustering process. Moreover,
intrinsic systematic uncertainties due to the limited size of the volume of
the Universe explored ("cosmic variance") need to be accounted for when
computing realistic error bars on the measured correlation values.

The aim of this paper is to present a comprehensive description of the biases
specific to the VVDS, along with the methods we developed to correct for them.
The strategy we adopt relies on the construction of realistic
"pre-observation" mock catalogs using the MoMaF software (Blaizot et al.,
2005) and the GalICS hybrid model for galaxy formation (Hatton et al., 2003).
We then observe these mock catalogs by mimicking the relevant observational
selections and biases. Comparing original and observed mock surveys allows us
(i) to understand quantitatively the impact of the different biases inherent
to the VVDS data on clustering estimates, and (ii) to explore and validate
methods that allow us to recover the original signal. This strategy is
possible because GalICS predictions have been shown to agree fairly well with
a wide range of observations (e.g. Hatton et al., 2003; Blaizot et al., 2004),
and are thus expected to yield catalogs realistic enough to carry out a
convincing consistency check. Because our mock catalogs contain realistic
clustering properties, we can also use them to predict the cosmic variance
amplitude, in order to compute realistic errors on the clustering estimates
we will perform on the real data.

The paper is organized as follows. In section 2 we discuss the different
kinds of biases expected in the current VVDS first-epoch data. In section 3
we discuss the construction of mock VVDS catalogs from the GalICS/MoMaF
simulations, which assume a flat Cold Dark Matter model with
$\Omega_m = 0.333$, $\Omega_\Lambda = 0.667$ and $h = 0.667$. In section 4 we
present the definitions of the two-point correlation functions. Then, in
section 5 we discuss the details of the error measurement strategy when
applied to the VVDS. In section 6 we show how the measured two-point
correlation function is affected by the features particular to our survey and
we discuss the methods developed to correct for these biases and properly
estimate the correlation function $\xi(r_p, \pi)$, its projection $w_p(r_p)$,
and the correlation length $r_0$ and slope $\gamma$, as a function of
redshift. Section 7 summarizes our results.

Fig. 1. Lay-out of the VIMOS field of view. INVAR masks
with laser-cut slits are placed on the focal plane within the
four rectangular areas ("VIMOS channels").



2. The selection function of VVDS first epoch 
observations 

The first epoch spectra of the VVDS-Deep, collected during the 2002 and 2003
campaigns, are concentrated within the 02h deep field and the CDFS
(Le Fèvre et al., 2005a). First epoch spectra have been collected for
galaxies down to $I_{AB} = 24$ in a 0.61 sq. degree sub-area of the VVDS-02h
field and in a region of 21 x 21.6 sq. arcminutes centered on the Chandra
Deep Field South (CDFS, Giacconi et al., 2002). The geometrical lay-out,
sampling rate and incompleteness of the VVDS First Epoch data are used as a
reference benchmark in this paper.

2.1. Catalog structure and biases 

Fig. 2. Galaxy distribution in a mock VVDS-02h catalog, constructed using the GalICS simulations with the same
lay-out as the 20 observed pointings in the actual first-epoch VVDS field and applying the full range of selection
effects present in the data, such as the photometric mask. The left panel shows the parent photometric field, including
all objects with $I_{AB} < 24$ within the current VVDS-02h boundaries and mask. In the right panel only the objects
selected for spectroscopy are shown. Note the density gradient towards the central part of the field, due to multiple
passes over the same area.

A number of factors, both in the parent photometric catalog from which the
target galaxies are selected and in the way the spectroscopic observations
are carried out, combine to create selection effects that bias any estimate
of galaxy clustering if not properly accounted for.

1. Photometric defects. Some areas are excised from 
the I-band CCD images during their photometric anal- 
ysis, due to the presence of bright stars or other in- 
strumental effects (e.g. stray-light from a bright star 
outside the field of view). The resulting photomet- 
ric galaxy catalog, therefore, features some artificially 
empty regions. 

2. VIMOS lay-out. The field of view of the VIMOS 
spectrograph consists of four 7' by 8' quadrants, separated
by 2' gaps, as shown schematically in Figure 1.
At the typical resolution used in the VVDS, between 
110 and 150 spectra are collected in each quadrant 
during a single observation. Clearly, no galaxies are 
observed over the area of the "cross" between the four 
quadrants, unless one observes the area with a new 
pointing, shifted with respect to the first one (see be- 
low). 

3. Missing quadrants. For a few pointings, one or two 
quadrants can be "blind" , i.e. with no spectra observed 
due to a misplacement of the multi-slit masks during 
the observations. 

4. Incomplete coverage. The planned final area is be- 
ing covered through a mosaic of adjacent pointings. 
Thus, at any intermediate stage the available spectral 
data set is distributed in a non-uniform fashion on the 
sky. The largest contiguous area currently covered in
the 02h deep field corresponds to about 0.5 square degrees,
with the geometry shown in Figure 2.

5. Varying sampling density. The VVDS observa- 
tional strategy involves multiple passes over the same 
area to increase the spectral sampling rate. While a 
central region of the 02h deep field is exposed 4 times 
(i.e. it is visited by four independent pointings with dif- 
ferent slit masks), the external areas are covered only 
twice due to the tiling strategy. During subsequent ob- 
serving runs, the VIMOS pointings are shifted with 
respect to the previous ones usually by around 2', to 
ensure that the cross visible in Figure 1 is filled. As
a consequence, the mean surface density of observed 
objects varies across the field. 

6. Optimization of the number of slits and mechanical constraints.
A specific source of bias in the VIMOS observations is introduced by
VMMPS, the VIMOS Mask Manufacturing Preparation Software, and
specifically by the Super-SPOC code (Bottini et al., 2005). The width of
a slit is set to 1 arcsecond (or about 5 detector pixels), and its
typical length is 6-10 arcseconds, so as to include both the object of
interest and enough information on the sky spectral background to
correct for it. The VMMPS software automatically allocates slits to
objects in the input catalog with the goal of maximizing the total
number of spectra. In general, this means that the spectroscopic sample
is not a random sparse sampling of the clustering pattern over the sky,
but a more homogeneous sub-sample. Specifically, VMMPS tends to place
objects in rows, so as to maximize the number of spectra across the CCD
(see Figure 3), with an additional slight preference towards objects of
small angular size. As is typical with multi-object spectrographs, the
minimum slit size implies that, after one single spectroscopic pass,
there is a bias against observing very close angular pairs on the sky.
Having multiple passes, however, significantly improves the situation,
allowing for very close pairs to be observed in subsequent exposures.

[Figure 3 panel title: VIMOS quadrant with SSPOC applied]

Fig. 3. Spectroscopic targets (filled circles) selected in
one of the four VIMOS quadrants from a complete VVDS
mock photometric sample (open circles). Note how the
optimization software tends to select spectroscopic targets
aligned along horizontal rows, while, clearly, very close
pairs are not observed. Typically, however, 4 independent
observations are conducted on the same area, each with
a similar target layout, but shifted by a few arcminutes.
This significantly reduces both the alignment and proximity
effects. The residual bias is then further corrected by
the weighting scheme discussed in § 4. Overall, the four
passes produce a typical sampling rate of one galaxy in
four.

The final spectroscopic sample is thus affected to different degrees by all
these factors. Figure 2 shows the current lay-out of the observed pointings
in the 02h field, compared to the parent photometric sample over the same
area. Features from the two main effects are obvious from Figure 2: holes in
the parent catalog and the varying sampling density in the spectroscopic
data, due to the multiple passes over the central area. The "striping" effect
due to the slit-placing software is not obvious at this resolution and is
better appreciated in Figure 3, where only one quadrant is displayed.
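
The bias against close pairs after a single pass, and its reduction by a second pass, can be illustrated with a toy model. The sketch below is not the Super-SPOC algorithm: it assumes, purely for illustration, that every slit occupies a fixed length along one axis of a quadrant, that slits may not overlap along that axis, and that targets are picked greedily in order of position. The function name `allocate_slits` and the 8 arcsec slit length are hypothetical choices.

```python
def allocate_slits(positions, slit_length):
    """Greedy one-pass allocation of non-overlapping slits (toy model only).

    positions   : coordinates (arcsec) of candidate targets along the slit axis
    slit_length : length (arcsec) occupied by each slit, centred on its target
    Returns the sorted list of positions that receive a slit.
    """
    selected = []
    last_edge = float("-inf")  # upper edge of the most recently placed slit
    for x in sorted(positions):
        lower, upper = x - slit_length / 2.0, x + slit_length / 2.0
        if lower >= last_edge:  # the slit fits without overlapping the previous one
            selected.append(x)
            last_edge = upper
    return selected


# Three isolated objects and one close pair (20 and 23 arcsec).
candidates = [0.0, 20.0, 23.0, 50.0, 80.0]
first_pass = allocate_slits(candidates, slit_length=8.0)

# A second pass only considers objects that were not observed in the first one.
remaining = [x for x in candidates if x not in first_pass]
second_pass = allocate_slits(remaining, slit_length=8.0)
```

In this toy configuration only one member of the close pair can be observed in the first pass; the other becomes available in the second pass, which is the qualitative behaviour described above.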



3. Constructing mock VVDS surveys 

The only way to understand the relative importance of the selection biases
discussed above and test possible correction schemes is to create and analyze
realistic simulations of our survey. Provided these simulations are realistic
enough, they allow us (1) to understand quantitatively the magnitude of
observational biases on the final statistical quantities to be measured, and
(2) to estimate realistic errors that include cosmic variance. Both these
points require that mock observations contain a spatial distribution of
galaxies consistent with VVDS observations (so as to measure clustering and
cosmic variance), along with realistic photometric and physical properties of
simulated galaxies (so as to mimic selection effects). The GalICS model for
galaxy formation (Hatton et al., 2003), along with the MoMaF mock observing
tool (Blaizot et al., 2005), fulfill these requirements and we thus use them
to build "pre-observation" catalogs that we then "observe" by progressively
adding all the VVDS observational biases and selections.

In this section, we first describe the GalICS simulation that we use, before
discussing how we build simulated VVDS observations that account for all
identified biases.



3.1. The GalICS simulations

GalICS (for Galaxies In Cosmological Simulations, see Hatton et al., 2003)
is a model of hierarchical galaxy formation which combines high resolution
cosmological simulations to describe the dark matter content of the Universe
with semi-analytic prescriptions to deal with the baryonic matter.

The cosmological N-body simulation we refer to throughout this paper assumes
a flat cold dark matter model with a cosmological constant
($\Omega_m = 0.333$, $\Omega_\Lambda = 0.667$). The simulated volume is a
cube of side $L_{box} = 100\,h^{-1}$ Mpc, with $h = 0.667$, containing
$256^3$ particles of mass $8.272 \times 10^9\,M_\odot$, with a smoothing
length of 29.29 kpc. The power spectrum was set in agreement with the
present-day abundance of rich clusters ($\sigma_8 = 0.88$, from Eke et al.,
1996), and the DM density field was evolved from z = 35.59 to z = 0,
outputting 100 snapshots spaced logarithmically in the expansion factor.
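
As a consistency check on these numbers, the particle mass follows from the mean matter density, the box volume and the number of particles. The short sketch below redoes this arithmetic; the physical constants are standard rounded values and the function name `particle_mass` is introduced here only for illustration.

```python
import math

G = 6.674e-11           # gravitational constant, m^3 kg^-1 s^-2
MPC_IN_M = 3.0857e22    # one megaparsec in metres
MSUN_IN_KG = 1.989e30   # one solar mass in kilograms


def particle_mass(omega_m, h, box_h_mpc, n_per_side):
    """Mass (in solar masses) of one N-body particle in a cubic box."""
    h0 = 100.0 * h * 1.0e3 / MPC_IN_M                # Hubble constant in s^-1
    rho_crit = 3.0 * h0 ** 2 / (8.0 * math.pi * G)   # critical density, kg m^-3
    box_m = box_h_mpc / h * MPC_IN_M                 # box side in metres
    total_mass = omega_m * rho_crit * box_m ** 3     # total matter mass, kg
    return total_mass / n_per_side ** 3 / MSUN_IN_KG


m_p = particle_mass(omega_m=0.333, h=0.667, box_h_mpc=100.0, n_per_side=256)
```

With the rounded parameters quoted in the text, the result agrees with the stated particle mass to better than one per cent.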

GalICS builds galaxies from this simulation in two
steps. First, halos of DM containing more than 20 particles
are identified in each snapshot using a friends-of-friends
algorithm. Their merging history trees are then computed 
following the constituent particles from one output to the 
next. Second, baryons are evolved within these halo merg- 
ing history trees according to a set of semi-analytic pre- 
scriptions that aim to account for e.g. heating and cooling 
of the gas within halos, star formation and its feedback on 
the environment, stellar population evolution and metal 
enrichment, formation of spheroids through galaxy merg- 
ers or disc instabilities. 

Three main points make GalICS particularly suitable for this study. First,
this model yields a wide range of predictions, among which are luminosities
(in many bands from the UV to the sub-mm), physical properties (such as sizes
of galaxies), and the positions of galaxies within the simulation snapshots.
Second, these properties have been shown to be in rather good agreement with
various observations (e.g. Hatton et al., 2003; Blaizot et al., 2004). Third,
mock observations are readily available from the GalICS Project's
web-page^1. These mock observations include 50 catalogs of 1 x 1 sq. deg.
that contain all the information we need in this study: apparent magnitudes
in the BVRI filters used at the CFHT, apparent sizes of the galaxies, angular
coordinates in the mock sky, and redshifts.

Before using GalICS mock samples, it is useful to state their limitations
(see however Blaizot et al., 2005 for a thorough description of these). There
are mainly three shortcomings to mock catalogs made using GalICS. First,
because of the finite mass resolution of the root simulation, faint galaxies
are not well described, or even missed when they lie in unresolved haloes.
This is not an issue for the present study, however, because the VVDS
detection limit is brighter than the GalICS resolution. Second, because mock
catalogs are built from a simulation of a finite volume, they involve
replications of this volume, along and perpendicular to the line of sight.
These replications lead to some negative bias in the correlation functions,
of at most ~ 10%. This is not a concern in this paper, because we just need
an approximate match with the observed data in order to perform an internal
consistency check. GalICS catalogs do provide an adequate match. Third, the
mock catalogs do not describe density fluctuations on scales larger than the
size of the simulated volume (~ 100 $h^{-1}$ Mpc). This implies that cosmic
variance is likely to be under-estimated if the volume probed by a mock
catalog is larger than the simulated volume. This under-estimate, however,
depends on the galaxy population considered: it will be large for rare
objects and small for "normal" galaxies. In other words, because cosmic
variance is basically given by the integral of the correlation function over
the survey, the error on the estimated cosmic variance depends on how much of
this integral we miss, that is, on how strongly the studied galaxies are
clustered. From Fig. 9, it can be seen that the size of the simulation is
enough for this under-estimate to be small at the scales we consider (i.e.
from 0.1 to 10 $h^{-1}$ Mpc). The dispersion found among the 50 GalICS cones
is thus expected to be a good estimate of cosmic variance. The mean number of
galaxies with $17.5 < I_{AB} < 24$ in the artificial catalogs is 77396. The
average redshift distribution of these 50 cones is shown in Figure 4, along
with the VVDS first epoch N(z) (Le Fèvre et al., 2005a).

We note that the redshift distribution of the simulated galaxies differs
significantly from that observed by the VVDS for the real Universe. This
simply tells us that the semi-analytic galaxy formation model adopted to
construct the GalICS simulations, while adequately reproducing a number of
observed features (see Blaizot et al., 2005), is not fully accurate. This,
however, is of no importance for the current analysis, as our main goal is to
test the internal differences in the measured quantities when either the
original parent sample or the final spectroscopic sample is observed. The
accuracy of these tests depends essentially on the small-scale properties of
the simulated galaxies (like the mean inter-galaxy separation and
clustering), rather than on the global redshift distribution. Conversely, in
the estimate of error bars the difference in absolute numbers between the
real and simulated samples within a given redshift slice will clearly have to
be taken into account.

^1 http://galics.iap.fr

[Figure 4: horizontal axis "redshift z"; curve label N_VVDS(z)]

Fig. 4. Average redshift distribution in the 50 mock
VVDS-02h surveys, normalized by the number of objects
in each cone, compared to the redshift distribution of
the observed VVDS galaxies. Note how the prediction of the
semi-analytic model of galaxy formation used to construct the
GalICS simulations differs from the real data. This is not a
concern for the purposes of this work: first, we are performing
internal tests of the effect of observing biases and of
their correction, which depend on the small-medium scale
clustering properties. Second, when error bars are estimated
for a specific redshift slice, their amplitude is
re-normalized accordingly, to account for the different number
of galaxies.

3.2. CCD photometric mask 

Bright (often saturated) stars represent a practical obstacle to accurate
galaxy photometry, and their diffused light can affect large areas of a CCD
astronomical image. All such areas were excised from the VVDS photometric
catalogs: there are no sources in these regions (McCracken et al., 2003).
Similarly, a "dead" area in the 02h field has been produced by a beam of
scattered light that crosses a large part of the field from North-East to
South-West. In total, a few percent of the total area is lost due to these
defects. The information on these "holes" in the photometric catalog is
stored in a FITS binary mask, with null values corresponding to dead pixels.
We have used this mask on the mock samples to exactly reproduce the pattern
of the observed data in our simulations.
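
Applying such a mask to a mock catalog amounts to a pixel look-up for every object. The sketch below is a minimal illustration under simplifying assumptions that are not taken from the paper: the mask is assumed to be already loaded into a two-dimensional array (reading the FITS file is not shown), and sky positions are mapped to pixels with a plain linear scale rather than a full astrometric solution. The function name `apply_mask` and the toy values are hypothetical.

```python
def apply_mask(catalog, mask, x0, y0, pixel_size):
    """Keep only the objects that fall on valid (non-zero) mask pixels.

    catalog    : list of (x, y) positions, in the same units as pixel_size
    mask       : 2D list, mask[row][col]; a value of 0 marks a dead pixel
    x0, y0     : coordinates of the lower-left corner of pixel (row 0, col 0)
    pixel_size : side of a square pixel
    """
    kept = []
    n_rows, n_cols = len(mask), len(mask[0])
    for x, y in catalog:
        col = int((x - x0) // pixel_size)
        row = int((y - y0) // pixel_size)
        inside = 0 <= row < n_rows and 0 <= col < n_cols
        if inside and mask[row][col] != 0:  # objects outside the mask are dropped too
            kept.append((x, y))
    return kept


# 3 x 3 toy mask with one dead pixel in the centre.
toy_mask = [[1, 1, 1],
            [1, 0, 1],
            [1, 1, 1]]
mock = [(0.5, 0.5), (1.5, 1.5), (2.5, 0.2), (3.5, 1.0)]
masked_mock = apply_mask(mock, toy_mask, x0=0.0, y0=0.0, pixel_size=1.0)
```

The same mask would also have to be applied to the random catalogs used in the correlation estimator, so that data and randoms share the same angular footprint.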



3.3. Effect of galaxy angular sizes 

In order to maximize the number of spectroscopic targets, the Super-SPOC
software (Bottini et al., 2005) also bases its choice of targets on the
projected angular radius of each galaxy along the slit direction. This means
that smaller galaxies are sometimes preferred, as they allow the program to
increase the number of targets. Any realistically simulated spectroscopic
sample must take this into account. Therefore, we have computed for each
simulated galaxy in GalICS a realistic angular radius, using the following
procedure.

GalICS describes galaxies with three components: a disc, a bulge and possibly
a nuclear starburst. For each of these, the model predicts the mass and a
scale-length that assumes the disc is exponential while the other two
spheroidal components follow a Hernquist profile (Hernquist, 1990). We used
these sizes to define an overall radius for each galaxy, which encloses 90%
of the total mass. Assuming that light has the same distribution as mass, we
then convert this radius to an apparent angular size, assuming the
above-mentioned cosmology.
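
The two ingredients of this procedure, a radius enclosing 90% of the mass and its conversion to an angle, can be sketched for a single component as follows. This is only an illustration: the text does not specify how the three components are combined into one overall radius, so each profile is treated separately here, the Hernquist case uses the three-dimensional enclosed mass, and the small-angle approximation is assumed. All function names are hypothetical.

```python
import math


def bisect(f, lo, hi, tol=1e-10):
    """Root of f in [lo, hi] by bisection; f(lo) and f(hi) must differ in sign."""
    f_lo = f(lo)
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if (f(mid) > 0) == (f_lo > 0):
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def r90_exponential_disc(scale_length):
    """Radius enclosing 90% of the mass of an exponential disc.
    Enclosed fraction: 1 - (1 + x) exp(-x), with x = R / scale_length."""
    x = bisect(lambda x: 1.0 - (1.0 + x) * math.exp(-x) - 0.9, 0.0, 50.0)
    return x * scale_length


def r90_hernquist(scale_radius):
    """Radius enclosing 90% of the mass of a Hernquist sphere.
    Enclosed fraction: r^2 / (r + a)^2, inverted analytically."""
    s = math.sqrt(0.9)
    return scale_radius * s / (1.0 - s)


def angular_size_arcsec(radius_kpc, z, omega_m=0.333, h=0.667, n_steps=10000):
    """Apparent angular radius in a flat model with a cosmological constant."""
    c_km_s = 299792.458
    dz = z / n_steps
    integral = 0.0
    for i in range(n_steps):  # midpoint rule for the comoving distance integral
        zi = (i + 0.5) * dz
        integral += dz / math.sqrt(omega_m * (1.0 + zi) ** 3 + 1.0 - omega_m)
    d_comoving_mpc = c_km_s / (100.0 * h) * integral
    d_angular_kpc = d_comoving_mpc / (1.0 + z) * 1000.0
    return math.degrees(radius_kpc / d_angular_kpc) * 3600.0
```

For example, `angular_size_arcsec(r90_exponential_disc(3.0), z=1.0)` would give the apparent 90% radius of a disc with a hypothetical 3 kpc scale-length placed at z = 1.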



3.4. Artificial stars 

The VVDS spectroscopic targets are selected purely on 
magnitude, $I_{AB} < 24$ and $I_{AB} < 22.5$ in the Deep and
Wide parts of the survey, respectively, without any a pri- 
ori star-galaxy separation. This avoids biases against com- 
pact galaxies and AGNs which may be introduced at faint 
magnitudes by unreliable star-galaxy classification based 
on morphology. Consequently, our spectroscopic sample is 
contaminated by stars. About 8.5% of the collected spectra
in the VVDS-Deep are stars and are discarded (the exact
fraction depends on galactic latitude and can be as
high as 20% in some cases for the "Wide" survey). These
stars obviously have no impact on the clustering analysis.
Their only effect is to reduce the total number of targeted 
galaxies, thus slightly affecting the overall statistics by in- 
creasing the expected variance. Since our aim here is to 
precisely quantify the biases and uncertainties on galaxy 
correlations computed from the final spectroscopic sam- 
ple, and compare them to the original parent sample, we 
decided to also take into account this small contribution. 
We therefore added to the artificial survey fields a set of 
simulated stars. 
Fig. 5. Number counts of artificial stars added to the
GalICS simulation, compared to the actual counts of stars
in the VVDS-02h field, identified morphologically from
the photometric data. The excess in the VVDS above
$I_{AB} = 20$ is due to the inability of the morphological
compactness criteria to discriminate stars from galaxies
and QSOs at faint magnitudes. When this is taken into
account, the models from Robin et al. (2003) reproduce
very well the actual distribution of stellar objects in the
VVDS.



Using the on-line tool of Robin et al. (2003)^2 we created a
one-square-degree catalog of artificial stars with $17.5 < I_{AB} < 24$,
which was added to the artificial galaxy photometric catalogs. Figure 5 shows
the number counts of the added stars, compared to the observed distribution
at bright magnitudes in the 02h field (as identified by SExtractor, Bertin
and Arnouts, 1996). The observed excess above $I_{AB} = 20$ in the 02h field
is the effect of mis-classified galaxies and QSOs, which also corroborates
our choice of excluding any pre-selection for the VVDS spectroscopy, to avoid
throwing these objects away.

As this parameter is used by VMMPS, apparent angular radii have also been
assigned to artificial stars, using the observed distribution of stellar
sizes in the 02h field, identified photometrically down to $I_{AB} = 21$ and
spectroscopically at fainter magnitudes. This range of apparent stellar radii
corresponds to the sizes of the point spread function ("seeing") at the faint
Kron radii measured for stars by SExtractor.



^2 The model of stellar population synthesis of the Galaxy
developed by Robin et al. (2003) produces a reliable catalog
of stars with appropriate number counts and magnitudes in
the visible and near-infrared spectral ranges in the
Johnson-Cousins and Koornneef systems, respectively.
Fig. 6. Spectroscopic success rate per magnitude bin in
the VVDS 02h field, including only those redshifts used
for the clustering analysis.

Fig. 7. Average redshift distribution in the GalICS mock
catalogs before and after the full observing strategy is
applied. No bias in the redshift distribution is observed.



3.5. Spectroscopic success rate 

Objects selected by the slit-positioning code do not yet form the final
redshift catalog. For some of the objects, redshift measurements are
impossible, usually because of poor signal-to-noise. This incompleteness is
clearly a function of magnitude. We define the spectroscopic success rate as
the ratio of the number of redshifts used for clustering analysis to the
total number of spectroscopically observed objects. Figure 6 shows the
spectroscopic success rate as a function of magnitude, which corresponds in
practice to the probability of measuring the correct redshift of a galaxy as
a function of its magnitude in the current observational configuration.
Overall, this shows that we are able to obtain a redshift for more than 80%
of the targeted objects between $I_{AB} = 17.5$ and 24. We therefore apply
this same probability function to our mock "observed" catalogs, rejecting the
corresponding fraction of targeted objects. We make the simplifying
assumption that the spectroscopic success rate is the same for all galaxy
types.
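
A magnitude-dependent rejection of this kind can be sketched as a random draw per target. The snippet below is an illustrative implementation, not the authors' code: the function name `apply_success_rate` is hypothetical, and the bin edges and success rates in the example are placeholder values, not numbers read from Figure 6.

```python
import random


def apply_success_rate(targets, success_rate, bin_edges, rng):
    """Randomly keep targets according to a magnitude-dependent success rate.

    targets      : list of (identifier, magnitude) tuples
    success_rate : success probability in each magnitude bin
    bin_edges    : bin boundaries, len(bin_edges) == len(success_rate) + 1
    rng          : random.Random instance (seed it for reproducibility)
    Objects outside the magnitude range covered by the bins are dropped.
    """
    kept = []
    for ident, mag in targets:
        for k, p in enumerate(success_rate):
            if bin_edges[k] <= mag < bin_edges[k + 1]:
                if rng.random() < p:
                    kept.append((ident, mag))
                break
    return kept


# Placeholder success-rate table (hypothetical values).
example_edges = [17.5, 20.0, 22.0, 23.0, 24.0]
example_rates = [0.95, 0.95, 0.90, 0.80]

rng = random.Random(1)
mock_targets = [(i, 17.5 + 6.5 * rng.random()) for i in range(1000)]
mock_redshift_sample = apply_success_rate(mock_targets, example_rates,
                                          example_edges, rng)
```

Because the draw depends only on magnitude, this sketch reflects the simplifying assumption stated above that the success rate is the same for all galaxy types.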

3.6. VIMOS spectral resolution 

The last point to be taken into account to produce a fully realistic mock
redshift catalog is the resolution of the VIMOS spectrograph in the set-up
used for the VVDS (Low-resolution RED Grism, R ~ 230), which translates into
a typical rms error on the measured redshift of around
$\sigma_{cz} = 275$ km/s. We therefore added to the final set of mock
redshifts a Gaussian-distributed dispersion with the same rms and zero mean.
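
In redshift units this dispersion corresponds to an rms of $\sigma_{cz}/c$, i.e. about $9 \times 10^{-4}$. A minimal sketch of the perturbation step is given below; the function name `perturb_redshifts` is hypothetical, and the error is assumed here to be independent of redshift and magnitude.

```python
import random

C_KM_S = 299792.458  # speed of light in km/s


def perturb_redshifts(redshifts, sigma_cz=275.0, rng=None):
    """Add a zero-mean Gaussian error of rms sigma_cz (km/s) to each redshift."""
    rng = rng or random.Random()
    sigma_z = sigma_cz / C_KM_S  # rms error expressed in redshift units
    return [z + rng.gauss(0.0, sigma_z) for z in redshifts]


observed_z = perturb_redshifts([0.45, 0.82, 1.10], rng=random.Random(7))
```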



3.7. Overall properties of mock VVDS surveys

All of the steps described above have been applied to each
of the 50 one-square-degree GalICS surveys, producing a
corresponding number of mock redshift samples which
reproduce with fidelity the lay-out, properties and biases of
the first-epoch VVDS 02h sample.

Figure 7 shows that, despite the slight bias of SSPOC towards choosing
smaller (and therefore fainter) objects, the redshift distribution N(z) of
the final spectroscopic samples is unbiased with respect to the original
complete GalICS one-square-degree survey. The difference seen in Figure 4
between the simulated cones and the observed VVDS redshift distribution is
therefore only the result of the model of galaxy formation adopted for the
simulation, and not of a selection effect. In particular, the observing
procedure could not have introduced, e.g., a stronger incompleteness in the
final N(z) at z > 1.



4. Two-point correlation statistics 

4.1. General estimator 

The two-point correlation function $\xi(r)$ is defined as the excess
probability above random that a pair of galaxies is observed at a given
spatial separation $r$ (Peebles, 1980).

It is the simplest statistical measurement of clustering, as a function of
scale, and it corresponds to the second moment of the distribution. Various
recipes have been proposed to estimate two-point correlation functions from
galaxy surveys, in particular to minimize the biases introduced by the finite
sample volume, edge effects, and photometric masks (Hamilton, 1993; Landy and
Szalay, 1993).
Here we adopt the Landy-Szalay estimator, which expresses $\xi(r)$ as

$$\xi(r) = \frac{N_R(N_R-1)}{N_G(N_G-1)}\,\frac{GG(r)}{RR(r)} - 2\,\frac{N_R-1}{N_G}\,\frac{GR(r)}{RR(r)} + 1 \qquad (1)$$



In this expression, $N_G$ and $N_R$ are the mean density (or, equivalently,
the total number) of objects respectively in the galaxy sample and in a
catalog of random points distributed within the same survey volume and with
the same redshift distribution and angular selection biases; $GG(r)$ is the
number of independent galaxy-galaxy pairs with separation between $r$ and
$r + dr$; $RR(r)$ is the number of independent random-random pairs within the
same interval of separations and $GR(r)$ represents the number of
galaxy-random pairs.
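
A brute-force version of this estimator can be sketched in a few lines. The code below is an illustration only, not the implementation used for the VVDS: each pair count is explicitly divided by the total number of available pairs of its kind before the counts are combined, which is the standard normalized form of the Landy-Szalay estimator and avoids any assumption about the pair-counting convention behind the prefactors of Eq. (1). The function names and the toy point sets are hypothetical, and the double loops are only practical for small samples.

```python
import math
import random


def pair_counts(a, b, edges, auto):
    """Histogram of pair separations between point lists a and b.
    If auto is True, a and b are the same sample and each pair is counted once."""
    counts = [0] * (len(edges) - 1)
    for i, p in enumerate(a):
        start = i + 1 if auto else 0
        for q in b[start:]:
            r = math.dist(p, q)
            for k in range(len(counts)):
                if edges[k] <= r < edges[k + 1]:
                    counts[k] += 1
                    break
    return counts


def landy_szalay(galaxies, randoms, edges):
    """Landy-Szalay estimate of xi in each separation bin."""
    n_g, n_r = len(galaxies), len(randoms)
    gg = pair_counts(galaxies, galaxies, edges, auto=True)
    rr = pair_counts(randoms, randoms, edges, auto=True)
    gr = pair_counts(galaxies, randoms, edges, auto=False)
    xi = []
    for k in range(len(gg)):
        gg_n = gg[k] / (n_g * (n_g - 1) / 2.0)  # normalized galaxy-galaxy count
        rr_n = rr[k] / (n_r * (n_r - 1) / 2.0)  # normalized random-random count
        gr_n = gr[k] / (n_g * n_r)              # normalized galaxy-random count
        xi.append((gg_n - 2.0 * gr_n + rr_n) / rr_n if rr_n > 0 else float("nan"))
    return xi


# Toy usage: unclustered "galaxies" and randoms in a unit cube.
rng = random.Random(0)
toy_galaxies = [(rng.random(), rng.random(), rng.random()) for _ in range(60)]
toy_randoms = [(rng.random(), rng.random(), rng.random()) for _ in range(120)]
xi_demo = landy_szalay(toy_galaxies, toy_randoms, edges=[0.05, 0.1, 0.2, 0.4])
```

In a real survey the random catalog must follow the same angular mask and redshift distribution as the data, as stated above; the uniform cube used here is only a placeholder.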

4.2. Redshift-space correlations 

We know that the three-dimensional galaxy distribution recovered from a
redshift survey is distorted by the effect of peculiar velocities. For this
reason, the redshift-space separation $s$ differs from the true physical
comoving separation $r$ between two galaxies. Since random velocities affect
only the redshift and not the position on the sky, the stretching occurs only
radially. Redshift distortions can be measured and separated from true
spatial correlations by computing the function $\xi(r_p, \pi)$, where the
separation vector of a pair of galaxies $\mathbf{s}$ is split into two
components, $\pi$ and $r_p$, respectively parallel and perpendicular to the
line of sight. Given two objects at redshifts $z_1$ and $z_2$, with observed
radial velocities $v_1 = cz_1$ and $v_2 = cz_2$ ($c$ being the speed of
light), we can define (Fisher et al., 1994) the line of sight vector
$\mathbf{l} = (\mathbf{v}_1 + \mathbf{v}_2)/2$ and the redshift difference
vector $\mathbf{s} = \mathbf{v}_1 - \mathbf{v}_2$, and also:

$$\pi = \frac{\mathbf{s}\cdot\mathbf{l}}{|\mathbf{l}|}, \qquad r_p^2 = \mathbf{s}\cdot\mathbf{s} - \pi^2 \qquad (2)$$

Equation 1 can be generalized to the case of $\xi(r_p, \pi)$ if we count the
number of pairs in a grid of bins $\Delta r_p$ and $\Delta\pi$ instead of
single bins $\Delta r$ or $\Delta s$.
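
The decomposition of Eq. (2) can be written directly in terms of the two redshift-space position vectors. The sketch below is a minimal illustration; the function name `rp_pi` is hypothetical, and returning the absolute value of the line-of-sight component is a choice made here for binning convenience.

```python
import math


def rp_pi(v1, v2):
    """Split the separation of two objects into transverse and radial parts.

    v1, v2 : 3-vectors of the two objects in redshift space (same units)
    Returns (r_p, pi), following the definitions of Eq. (2).
    """
    s = [a - b for a, b in zip(v1, v2)]          # separation vector s = v1 - v2
    l = [0.5 * (a + b) for a, b in zip(v1, v2)]  # line-of-sight vector l = (v1 + v2) / 2
    l_norm = math.sqrt(sum(c * c for c in l))
    pi = abs(sum(a * b for a, b in zip(s, l))) / l_norm
    rp_sq = sum(c * c for c in s) - pi * pi
    return math.sqrt(max(rp_sq, 0.0)), pi        # guard against round-off below zero


# A purely radial pair and a purely transverse pair.
radial = rp_pi((0.0, 0.0, 100.0), (0.0, 0.0, 90.0))
transverse = rp_pi((3.0, 0.0, 100.0), (-3.0, 0.0, 100.0))
```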

Observed distortions in galaxy surveys can be separated into two main
contributions: on small scales, the distortion is dominated by random
internal velocities in groups and clusters, causing a stretching of
$\xi(r_p, \pi)$ along the $\pi$ direction (the so-called "fingers of God"
effect). On large scales, on the other hand, $\xi(r_p, \pi)$ contours tend to
be flatter, due to coherent infall of galaxies onto large-scale
overdensities, known as the "Kaiser effect" (Kaiser, 1987). The latter is a
weak effect and needs very large samples to be seen with sufficient accuracy,
as shown by the 2dF survey (Hawkins et al., 2003).

4.3. Projected correlation function $w_p(r_p)$

We can recover the real-space correlation function $\xi(r)$
by projecting $\xi(r_p, \pi)$ along the line of sight, onto the $r_p$
axis. In this way we integrate out the dilution produced by
the redshift-space distortion field and obtain a quantity,
$w_p(r_p)$, which is independent of the redshift-space distortions:

$$w_p(r_p) \equiv 2 \int_0^\infty \xi(r_p, \pi)\, d\pi = 2 \int_0^\infty \xi\left[ (r_p^2 + y^2)^{1/2} \right] dy. \tag{3}$$

In the right-hand side of the equation, $\xi$ is simply the
usual real-space two-point correlation function $\xi(r)$, evaluated
at the specific separation $r = \sqrt{r_p^2 + y^2}$. If we now
assume a power-law model

$$\xi(r) = \left( \frac{r}{r_0} \right)^{-\gamma}, \tag{4}$$

with $\gamma$ being the slope of the correlation function and $r_0$
the correlation length, the integral can be computed
analytically, giving as a result

$$w_p(r_p) = r_p \left( \frac{r_0}{r_p} \right)^{\gamma} \frac{\Gamma\left(\frac{1}{2}\right) \Gamma\left(\frac{\gamma - 1}{2}\right)}{\Gamma\left(\frac{\gamma}{2}\right)}, \tag{5}$$

where $\Gamma$ is Euler's Gamma function.
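As a cross-check of the closed form for a power-law $\xi(r)$, the line-of-sight integral can be evaluated numerically and compared with the Gamma-function expression. The sketch below assumes $1 < \gamma < 3$ and uses the substitution $y = r_p \tan\theta$ to map the infinite range onto $[0, \pi/2)$; the function names and the midpoint rule are illustrative choices, not part of the paper.

```python
import math

def wp_power_law(rp, r0, gamma):
    """Analytic projected correlation function for xi(r) = (r / r0)**(-gamma)."""
    return (rp * (r0 / rp) ** gamma
            * math.gamma(0.5) * math.gamma((gamma - 1.0) / 2.0)
            / math.gamma(gamma / 2.0))

def wp_numeric(rp, r0, gamma, n=100000):
    """Midpoint-rule estimate of 2 * integral_0^inf xi(sqrt(rp^2 + y^2)) dy.

    With y = rp * tan(theta) the integrand becomes
    rp * (r0 / rp)**gamma * cos(theta)**(gamma - 2) on [0, pi/2).
    """
    h = (math.pi / 2.0) / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * h
        total += math.cos(theta) ** (gamma - 2.0)
    return 2.0 * rp * (r0 / rp) ** gamma * total * h
```

The two functions should agree to well below a percent for typical values such as $r_0 = 3$ and $\gamma = 1.8$, which is a convenient sanity check before fitting measured $w_p(r_p)$ points.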

5. Error estimate and fitting technique 

5.1. Error bars on correlation functions 

Ideally, if the studied data set consisted of a large enough
number of statistically independent pairs, such that the
central limit theorem applies, then the distribution of
estimates of $\xi$ in an ensemble of similar samples should be
Gaussian. The $1\sigma$ uncertainty in $\xi$ (the "cosmic error")
would then be the square root of its variance $\langle \Delta\xi^2 \rangle$
(Peebles, 1973). However, the theoretical expression for
$\langle \Delta\xi^2 \rangle$ depends on the poorly known and difficult to
measure four-point correlation function. Moreover, since
the measured $\xi$ is not exactly coincident with the
theoretical $\xi$, we expect its uncertainty to be also somewhat
different from the value provided by the theory. This effect
is known as a cosmic bias.

A few different ways of estimating errors on two-point
correlation functions have been used in the literature
(for a wider discussion, see e.g. Hamilton, 1993;
Fisher et al., 1994; Bernardeau et al., 2002). The case
closest to the ideal situation is when the survey is large
enough that it can be split into a number of sub-samples.
Correlations are then estimated independently for each
of these, and error bars for the parent sample computed
as the rms values. This has been the case, for example,
of the angular correlation function from the APM survey
(e.g. Maddox et al., 1990). However, the number of
sub-samples cannot be large, otherwise the explored scales will
be significantly reduced with respect to the parent survey.
The consequence is that the variance is typically
overestimated, and these error bars usually represent upper
limits to the true errors.

Simple Poissonian errors (e.g. proportional to the
square root of the total number of galaxy pairs in each
bin) underestimate the error bars substantially. Statistical
corrections were proposed (Kaiser, 1986) by multiplying
Poissonian errors by a factor $1 + 4\pi n J_3$, with $n$ being the
number density of objects and $J_3 = \int_0^{r_J} r^2 \xi(r)\, dr$, where
we assume that the actual correlation function vanishes for
$r > r_J$. However, this method also tends to give relatively
small errors (Fisher et al., 1994).
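For a power-law correlation function the integral $J_3$ has a closed form, which makes the size of the correction factor easy to evaluate. The following sketch assumes $\xi(r) = (r/r_0)^{-\gamma}$ with $\gamma < 3$ out to a cut-off radius `r_j`; both function names are hypothetical.

```python
import math

def j3_power_law(r0, gamma, r_j):
    """J3 = integral_0^{r_j} r^2 (r / r0)**(-gamma) dr, valid for gamma < 3."""
    return r0 ** gamma * r_j ** (3.0 - gamma) / (3.0 - gamma)

def corrected_poisson_error(poisson_error, n_density, j3):
    """Multiply a Poissonian error by the factor 1 + 4 pi n J3 quoted in the text."""
    return poisson_error * (1.0 + 4.0 * math.pi * n_density * j3)
```

The number density and $J_3$ must be expressed in consistent units (for example $h^3\,{\rm Mpc}^{-3}$ and $h^{-3}\,{\rm Mpc}^3$) so that their product is dimensionless.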

Over the last twenty years a widely used method
has been the so-called "bootstrap resampling"
(Barrow et al., 1984). It is based on the idea of
"perturbing" the data set, by randomly creating a large number of
comparable "pseudo data-sets", which differ only slightly
from the original sample. If this contains $N$ objects, then
each bootstrap sample is created by selecting $N$ of these,
but allowing for multiple selections of the same object.
This means that some objects will not be included in one
given pseudo data-set, while others will be counted twice
or three times. This is a good test of the robustness of
measured correlations, especially on large scales where
having a large number of pairs does not always mean a
robust measurement: consider for example the case of a
single isolated galaxy at a separation $r$ from a cluster
containing 1000 galaxies: $\xi(r)$ will contain a large number
of pairs, but only one will be independent. On the
other hand, bootstrap errors often tend to over-estimate
the theoretical variance $\langle \Delta\xi^2 \rangle$. In general, however,
despite debates on their theoretical justification, they
have represented a practical way to obtain error bars in
correlation analysis which are not far from the true ones.
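The resampling step described above is short enough to sketch directly. In this minimal illustration, `statistic` stands for any function of a galaxy list (for instance a correlation function estimate in one bin); the function names, the seed handling and the default of 100 resamplings are assumptions made for the example.

```python
import random

def bootstrap_sample(objects, rng):
    """Draw len(objects) items with replacement from objects."""
    n = len(objects)
    return [objects[rng.randrange(n)] for _ in range(n)]

def bootstrap_rms(objects, statistic, n_boot=100, seed=0):
    """Scatter (rms about the mean) of a statistic over n_boot pseudo data-sets."""
    rng = random.Random(seed)
    values = [statistic(bootstrap_sample(objects, rng)) for _ in range(n_boot)]
    mean = sum(values) / n_boot
    return (sum((v - mean) ** 2 for v in values) / n_boot) ** 0.5
```

Because each pseudo data-set is drawn with replacement, some objects appear several times and others not at all, which is what perturbs the pair counts.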

The use of bootstrapping became less and less popular
in recent years, with the advent of large N-body simulations
reproducing the matter distribution over significant
volumes of the Universe. Coupled to physically sound
definitions of "galaxies", these allowed the construction of
sets of independent mock surveys, from which ensemble
errors could be computed from the scatter in the different
catalogs. This is the same technique used to construct our
VVDS mock surveys. Clearly, a good match is necessary
between the volume and resolution of the simulation, on
one side, and the depth and size of the survey on the other.
Furthermore, the power spectrum of the simulation must
provide a realistic description of long waves, so as to properly
include cosmic variance. Progress both in our knowledge
of structure on the largest scales and in the size and
resolution of N-body simulations has improved on early
applications of this technique (Fisher et al., 1994). For this
reason, since the GalICS simulations are available, we could
use this as our main method for error estimation.

However, as we detail below, the covariance matrix
reconstructed from the simulations cannot be applied in a
straightforward way to the observed data. Indeed, our
fitting technique, discussed below, uses the covariance
matrix to properly account for bin-to-bin correlations
when fitting correlation functions: when the covariance
matrix extracted from the set of 50 mock VVDS surveys
is used (after proper normalization of the average values),
the fit is often unstable. In other words, the covariance
matrix produced by the ensemble of mock surveys, although
providing sufficiently realistic diagonal elements, has
off-diagonal non-zero values which differ from those
pertaining to the data sample (which of course are unknown).
For this reason, we modified our strategy and resorted to
the bootstrap technique to estimate the bin-to-bin
covariance. This means that our error bars on the estimated
correlation functions are obtained via the more reliable
scatter between the mock surveys, but a bootstrap is used
to estimate the off-diagonal terms of the covariance
matrix.

5.2. Fitting correlation functions 

It is well known that fitting of correlation functions like
$\xi(s)$ or $w_p(r_p)$ cannot be performed via the standard
least-squares method, due to the correlation existing among the
different bins. The procedure we adopted to estimate the
power-law parameters of $\xi(r)$, $r_0$ and $\gamma$, from the projected
function $w_p(r_p)$, using eq. 5, follows Fisher et al. (1994) and
Guzzo et al. (1997), with some specific differences that are
described in the following.

Let us consider a given redshift slice $[z_1, z_2]$. Within
this same interval, we estimate the correlation function
$\xi(r_p, \pi)$ from: 1) 50 mock VVDS surveys; 2) the real VVDS
data; 3) $N_{\rm boot}$ (typically 100) bootstrap resamplings of the
VVDS data. We then compute, for each of these estimates,
$w_p(r_p)$, projecting $\xi(r_p, \pi)$ along the line of sight (eq. 3),
with an upper integration limit $\pi_{\rm max}$, chosen in practice so
that it is large enough to produce a stable estimate of $w_p$.
Similarly to other authors (see e.g. Guzzo et al., 1997),
we find $w_p(r_p)$ quite insensitive to the choice of $\pi_{\rm max}$ in
the range $15\ h^{-1}$ Mpc $< \pi_{\rm max} < 25\ h^{-1}$ Mpc for $r_p <
10\ h^{-1}$ Mpc. Too small a value for this limit would miss
small-scale power, while too large a value has the effect of
adding noise into $w_p$. After a set of experiments we have
chosen $\pi_{\rm max} = 20\ h^{-1}$ Mpc.
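The truncated projection can be sketched as a simple sum over line-of-sight bins. The example assumes that, for one value of $r_p$, $\xi(r_p, \pi)$ is tabulated in bins of equal width `d_pi` starting at $\pi = 0$; the function name and the binning convention are illustrative assumptions rather than the paper's actual implementation.

```python
def project_xi(xi_row, d_pi, pi_max):
    """Approximate w_p = 2 * sum_j xi(r_p, pi_j) * d_pi for bins below pi_max.

    xi_row[j] holds xi(r_p, pi) in the bin covering [j * d_pi, (j + 1) * d_pi].
    Bins extending beyond pi_max are left out of the sum.
    """
    total = 0.0
    for j, xi in enumerate(xi_row):
        if (j + 1) * d_pi <= pi_max + 1e-12:
            total += xi * d_pi
    return 2.0 * total
```

Repeating the call with several values of `pi_max` is a direct way to reproduce the stability test described above.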

In the following, we call $w_p^k(r_i)$ the value of $w_p$
computed at $r_p = r_i$ in the cone $k$, where $1 \le k \le N_{\rm GalICS} =
50$ if we consider the GalICS data or $1 \le k \le N_{\rm boot}$ if we
consider the bootstrap data. Unless otherwise mentioned,
$N_{\rm boot} = 100$ is used.

Whether we consider the mock or bootstrap samples,
we can always compute the associated covariance matrix,
$C$, between the values of $w_p$ in the $i$-th and $k$-th bins:

$$C_{ik} = \left\langle \left( w_p^j(r_i) - \langle w_p^j(r_i) \rangle_j \right) \left( w_p^j(r_k) - \langle w_p^j(r_k) \rangle_j \right) \right\rangle_j, \tag{6}$$

where $\langle\ \rangle_j$ indicates an average over all bootstrap or
mock realizations $j$. When the correlation function is
computed from a finite sample, the values of $\xi(r)$ (or $w_p(r)$)
at different separations are not independent^1. For this
reason one cannot use a straightforward $\chi^2$ minimization,
which assumes that all bins are independent and
that the errors follow the Gaussian distribution, to
find the best-fit parameters of a model to the observed
data. However, $C$ is symmetric and real and therefore
can be diagonalized by a unitary transformation if its
determinant is non-vanishing. In practice, the estimated
functions are oversampled, $C$ is not singular and can
therefore be inverted by a simple Cholesky decomposition
(Numerical Recipes, Press et al., 1992, Volume 1,
Chapter 2)^2. Then, if we now call $H = C^{-1}$, we can fit
$w_p^{\rm VVDS}$ by minimizing a generalized $\chi^2$, which is defined
as

$$\chi^2 = \sum_i \sum_k \left( w_p^{\rm mod}(r_i) - w_p^{\rm VVDS}(r_i) \right) H_{ik} \left( w_p^{\rm mod}(r_k) - w_p^{\rm VVDS}(r_k) \right), \tag{7}$$

as a function of the two free parameters $r_0$ and $\gamma$ of
the power-law model.

^1 For example, imagine that one galaxy is removed from the
sample: this galaxy contributes pairs at many different
separations, thus affecting virtually all bins in the correlation
function.
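The covariance estimate and the generalized $\chi^2$ can be illustrated with a small standard-library sketch. It assumes that `samples[j][i]` holds $w_p$ for realization $j$ in bin $i$, normalizes the covariance by the number of realizations, and evaluates $d^T C^{-1} d$ through a Cholesky factor instead of forming $H = C^{-1}$ explicitly; all function names are hypothetical.

```python
import math

def covariance(samples):
    """C[i][k] = average over realizations of the product of deviations in bins i and k."""
    n = len(samples)
    nbins = len(samples[0])
    mean = [sum(s[i] for s in samples) / n for i in range(nbins)]
    return [[sum((s[i] - mean[i]) * (s[k] - mean[k]) for s in samples) / n
             for k in range(nbins)] for i in range(nbins)]

def cholesky(c):
    """Lower-triangular L with L L^T = C; raises ValueError if C is not positive definite."""
    n = len(c)
    low = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = c[i][j] - sum(low[i][m] * low[j][m] for m in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                low[i][i] = math.sqrt(s)
            else:
                low[i][j] = s / low[j][j]
    return low

def generalized_chi2(model, data, c):
    """chi^2 = d^T C^{-1} d with d = model - data, using forward substitution L y = d."""
    low = cholesky(c)
    d = [m - x for m, x in zip(model, data)]
    y = []
    for i in range(len(d)):
        y.append((d[i] - sum(low[i][m] * y[m] for m in range(i))) / low[i][i])
    return sum(v * v for v in y)
```

With a diagonal covariance this reduces to the ordinary $\chi^2$. When the number of bins is not smaller than the number of realizations the estimated matrix is singular and the factorization fails, which is the situation to avoid when choosing the binning.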

In principle, the complete process could be done using
only our set of 50 mock VVDS surveys. In practice, as
explained above, the bin-to-bin covariance obtained from
the GalICS mock samples does not provide a statistically
stable matrix to be used with the generalized $\chi^2$ method.
Therefore, we used the covariance matrix
obtained from the $N_{\rm boot}$ bootstrap resamplings of the
galaxy data set.

This provides the best solution $(r_0, \gamma)_{\rm data}$, which
minimizes $\chi^2_{\rm boot}(r_0, \gamma)$. At the same time,
however, we use the 50 mock surveys to obtain the most
realistic error contours $\chi^2(r_0, \gamma)$ for our estimated
$(r_0, \gamma)_{\rm data}$, as these, unlike bootstrap errors, include
cosmic variance.

The final error contours, therefore, are obtained by
fitting the mean of the 50 $w_p$ of the mock VVDS surveys, using a
covariance matrix computed from the same 50 $w_p$. This
process provides a solution $(r_0, \gamma)_{\rm GalICS}$ associated
with the error contours $\chi^2_{\rm GalICS}(r_0, \gamma)$. We then re-center
these contours around $(r_0, \gamma)_{\rm data}$ with the renormalization
$r_0 \leftarrow r_0 \times (r_0^{\rm GalICS} / r_0^{\rm data})$ and
$\gamma \leftarrow \gamma \times (\gamma^{\rm GalICS} / \gamma^{\rm data})$.



To take into account the different $N(z)$ of GalICS and
VVDS, we multiply the error contour $\chi^2_{\rm GalICS}$ computed
for each redshift slice by a factor $N_{\rm VVDS} / N_{\rm GalICS}$, where
$N_{\rm VVDS}$ is the number of VVDS galaxies and $N_{\rm GalICS}$ is
the number of GalICS galaxies in this redshift slice.

The error bars computed as above for each $w_p(r_i)$ value
correspond to the rms of the 50 $w_p(r_i)$, normalized to the
data.



6. Biasing effects and their removal 

We now quantitatively establish the impact of the VVDS
selection effects on the measured correlations and the
accuracy of our correcting scheme, using the GalICS mock
samples.



^2 Note that if the number of bins we want to fit, i.e. the size of
the matrix, were greater than or equal to the number of realizations
then, even if the matrix remains symmetric, the vectors would
not be independent and the matrix $C$ could not be inverted.
Fig. 8. Impact of the observational process on the
estimate of the angular two-point correlation function $\omega(\theta)$
for one mock VVDS cone: the mock VVDS survey (open circles)
is compared to the original parent field (filled circles).
The large distortion introduced by the
observing strategy affects practically all angular scales.



6.1. Impact on angular correlations 

As we have seen in the previous section, the biases and
selection effects due to the observing strategy and
instrumental limitations affect the properties of the
angular distribution of objects, with respect to a random
sub-sampling of the galaxy clustering process. It is therefore the
angular correlation function $\omega(\theta)$ that will primarily
reflect these biases. Clearly, there is no specific scientific
reason to measure the angular correlation function from
the spectroscopic sample, as this can be done more
easily and with much greater confidence using the full VVDS
photometric catalog (McCracken et al., 2003). Nevertheless,
$\omega(\theta)$ allows us to illustrate the level of distortions
introduced by our angular selection function.

To this end, Figure 8 shows the angular correlation
function computed from one mock VVDS redshift survey
without correcting for these effects (i.e. using a random
sample which simply follows the geometrical borders of
the galaxy sample, as one would do for a homogeneous
angular selection), compared to that of the original mock
catalog. We used the angular version of the Landy-Szalay
estimator, without taking into account any
incompleteness on any scales. The comparison to the $\omega(\theta)$ of
the parent survey reveals the very strong distortions introduced
over a wide range of angular scales.
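For reference, the Landy-Szalay estimator in its standard normalized form, $(DD - 2DR + RR)/RR$, can be written as a short function. This sketch assumes raw pair counts in a single angular (or spatial) bin together with the sizes of the galaxy and random catalogs; the function name and argument layout are illustrative, and the paper's own equation for the estimator is not reproduced in this section.

```python
def landy_szalay(dd, dr, rr, n_gal, n_ran):
    """Landy-Szalay estimate in one bin from raw pair counts.

    dd, rr are counts of distinct galaxy-galaxy and random-random pairs,
    dr is the count of galaxy-random cross pairs.
    """
    dd_norm = dd / (n_gal * (n_gal - 1) / 2.0)
    dr_norm = dr / (n_gal * n_ran)
    rr_norm = rr / (n_ran * (n_ran - 1) / 2.0)
    return (dd_norm - 2.0 * dr_norm + rr_norm) / rr_norm
```

If the random catalog follows only the geometrical borders of the survey, as in the uncorrected case discussed above, any structure in the angular selection function leaks directly into the estimate.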
Fig. 9. Redshift-space two-point correlation function $\xi(s)$ for one mock VVDS-02h field, computed in four redshift
bins. The true $\xi(s)$ computed for the whole parent sample (stars) is compared to that measured from the "observed"
sample, first without any correction (open circles, left four panels) and then applying our correction scheme (triangles,
right four panels). Error bars are the $1\sigma$ ensemble rms among the 50 VVDS mock samples.
Fig. 10. Same as Figure 9, but for the $\xi(r_p, \pi)$ correlation function. The contours correspond to values for $\xi(r_p, \pi)$
of 0.4, 1 (bold), 2.0, 5.0. Dashed lines refer to the complete mock sample, while solid ones describe the sample after
applying the VVDS selection function.



6.2. Correction scheme 

The biases discussed so far call for two types
of corrections, which we discuss in detail in this section.

1) Global correction. To account for the effects of
uneven boundaries and varying sampling rate we construct a
random catalog, which consists of the same number of
separately created pointings as the galaxy sample, thus
reproducing the global "exposure map" (i.e. number of multiple
passes over a given point of the sky) and the corresponding
large-scale surface density variations of the galaxy redshift
sample. The holes and excised regions in the photometric
sample are similarly taken into account by applying the
same binary mask to the random sample. These first-order
corrections already account for most of the observational
biases. When taken into account, they remove most of the
negative effects of the observing strategy on the
correlation functions, in particular the global overestimation of
correlation functions (see the accompanying figures).

Fig. 11. Same as Figures 9 and 10, but for the projected function $w_p(r_p)$, measured before (dashed line) and after
(solid line) the full observing strategy has been applied. This comparison shows that our method is able to properly
recover $w_p(r_p)$. We note, however, that, being closely related to the angular function, $w_p(r_p)$ remains the most sensitive
among the 3D correlation functions to the observational biases and the most difficult to recover properly in all $r_p$ bins.

2) Small scale correction. What remains to be cor- 
rected is the slight bias introduced by the slit-positioning 
software and the mechanical limitations (slit size, close- 
ness of slits and so forth). We have seen that the SSPOC 
selection is not an entirely random sampling of the actual 
angular distribution of objects, but rather a more homo- 
geneous sub-set, preferentially concentrated along specific 
rows. This selection affects primarily the small-scale values 
of the correlation function, corresponding to the typical 
slit size: with only one spectroscopic pass, pairs of galax- 
ies with separation smaller than the slit size will always 
have only one galaxy observed, and thus their contribution
to $\xi$ will be lost. With repeated passes this problem is
alleviated, as the software chooses each time different ob- 
jects (except for a small number of objects observed twice 
for error checking purposes). Using the full 2D informa- 
tion available from the parent photometric catalog (that 
tells us how many galaxies on the sky have been missed 
in the spectroscopic sample), we developed a weighting 
scheme that weighs each targeted galaxy proportionally to 
its "representativity" in terms of local angular pair den- 
sity. 

Let us therefore consider a circular region of radius $\theta_w$
around a galaxy $i$ located within a specific redshift slice
$k$, and define inside $\theta_w$ the following quantities:

$n_{\rm gal}(i)$ - the number of galaxies in the parent photometric
catalog

$n_z(i)$ - the number of galaxies with measured redshift

$n_{\rm in}(i)$ - the subset of these belonging to the same redshift
slice as the central galaxy

$n_{\rm exp}(i)$ - the number of galaxies expected to belong to the
same redshift slice, which can be written as

$$n_{\rm exp}(i) = n_{\rm in}(i) + n_{\rm rem}(i), \tag{8}$$



with $n_{\rm rem}$ being the fraction of unobserved neighboring
galaxies in the parent photometric catalog expected to
belong to the same redshift slice. This number can be written
as

$$n_{\rm rem}(i) = P_{\rm slice} \left[ n_{\rm gal}(i) - n_z(i) \right], \tag{9}$$



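where $P_{\rm slice}$ is the probability that a generic measured
galaxy belongs to that specific redshift slice. Here one can
make the reasonable assumption that the observed
redshift distribution is sufficiently well sampled as to
provide, when averaged over a suitably chosen radius $\theta_a$, an
unbiased estimate of $P_{\rm slice}$ for any $k$-th slice as

$$P_{\rm slice}(k) = \frac{N_{z,k}(< \theta_a)}{N_{z,{\rm total}}(< \theta_a)}, \tag{10}$$

i.e. the ratio of the number of measured redshifts within $\theta_a$
that fall in slice $k$ to the total number of measured redshifts
within $\theta_a$.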



The choice of $\theta_a$ is clearly critical, as it has to be large
enough to allow a proper sampling of existing structures
along the line of sight (and thus minimize the noise
introduced by the weight), but also small enough not to dilute
the effect of single structures within one redshift slice. In
practice, given the current size of the 02h field ($\sim 0.5$
square degrees), we have obtained the best results using
$\theta_a = 30'$, which encloses virtually the entire field. Note
also that $n_{\rm gal}(i)$, i.e. the number of galaxies in the parent
catalog, will be given by $n_{\rm gal}(i) = n_{\rm all}(i) \times f_{\rm gal}$, with $f_{\rm gal}$
being the probability that a randomly chosen object from
the photometric catalog is not a star but a galaxy and
$n_{\rm all}(i)$ the number of all locally observed objects in this
catalog. For the actual VVDS 02h field, this probability
has been estimated to be $f_{\rm gal} = 0.92$.

The construction of the actual weight to recover the
loss of small-scale pairs produced essentially by the
proximity bias is not unequivocal. After several experiments
with weighting by local densities (of expected vs. observed
spectra), we obtained the best results weighting by pairs.
The two-point correlation function being a pair-weighted
statistic, we constructed our weight $w(i)$ for a given galaxy
$i$ from the ratio of the expected to the measured number
of pairs within $\theta_w$. Specifically, if one wants the local
angular pair density to be conserved, each pair should be
counted as:

$$w(i) \times w(j) = \frac{n_{\rm exp}(i) \times (n_{\rm exp}(i) - 1)}{n_{\rm in}(i) \times (n_{\rm in}(i) - 1)}. \tag{11}$$

And, consequently, a single object is assigned a weight

$$w(i) = \sqrt{\frac{n_{\rm exp}(i) \times (n_{\rm exp}(i) - 1)}{n_{\rm in}(i) \times (n_{\rm in}(i) - 1)}}. \tag{12}$$



To define the optimal angular size $\theta_w$ defining the
"neighborhood" of a galaxy, we experimented with different
values in the range 5" to 1'. Not surprisingly, the best
correction is obtained for $\theta_w$ in the range 30" to 45", which is
comparable to the length of the VIMOS spectra as
projected on the sky. In all computations presented here, we
adopted the value $\theta_w = 40$".
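The per-galaxy pair weight can be sketched as follows. The example assumes one particular reading of the scheme: the expected count is the observed in-slice count plus the slice probability times the number of unobserved neighbors, and the weight is the square root of the ratio of expected to measured pair counts. The function name, the argument names and the fallback weight of 1 when fewer than two in-slice galaxies are present are assumptions made for this illustration, not choices stated in the text.

```python
import math

def pair_weight(n_gal, n_z, n_in, p_slice):
    """Weight of a galaxy from counts inside its angular neighborhood.

    n_gal   - galaxies in the parent photometric catalog
    n_z     - galaxies with measured redshift
    n_in    - measured galaxies in the same redshift slice
    p_slice - probability that a measured galaxy falls in that slice
    """
    n_rem = p_slice * (n_gal - n_z)   # unobserved neighbors expected in the slice
    n_exp = n_in + n_rem              # expected number in the slice
    if n_in < 2:
        # Assumption: no measured pair exists to be up-weighted, keep unit weight.
        return 1.0
    return math.sqrt(n_exp * (n_exp - 1.0) / (n_in * (n_in - 1.0)))
```

When every neighbor has a measured redshift the expected and measured pair counts coincide and the weight reduces to 1.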

The following sections will present the results of
extensive tests of this correction scheme, based on the GalICS
mock VVDS surveys.

6.3. Application to redshift-space correlations 

We have applied the manipulations presented in the
previous section to our mock VVDS 02h surveys and compared
the results to those obtained from the whole 1 deg $\times$ 1 deg
mock fields. The results are shown in Figures 9, 10 and
11 for the same mock sample used for measuring $\omega(\theta)$
(Figure 8), split into 4 redshift bins. In each of these
figures, comparison of the four left to the four right
panels demonstrates the effect of the overall correction. In
general, in redshift space the effect of the observational
biases is much less severe, being diluted by the unaffected
clustering measured along the line of sight. Still, we see
how a proper estimate does require a correction.

Looking at $\xi(s)$ (Figure 9), we see that the correction
introduced by our scheme is in general very good. The full
bi-dimensional correlation function $\xi(r_p, \pi)$ (Figure 10)
shows the effect in more detail, indicating also that the
impact of the angular bias on spatial correlations depends
on redshift. This is to be expected, given that a fake
inhomogeneity at a given angular scale affects larger spatial
scales at larger redshifts. However, as seen from the four
right panels, the bulk of the problem is corrected by our
technique.

Finally, the corresponding projected function, $w_p(r_p)$,
which is the one that will be fitted to estimate the
real-space correlation length and slope (Le Fevre et al. 2005),
does not show any significant systematic effect, nor
scale-dependent bias (see also § 6.4 below), if one excludes a
residual effect in the highest-redshift bin (which may be
specific to the mock sample used).

6.4. Accuracy in recovering $r_0$ and $\gamma$

Let us now evaluate more quantitatively how well the
weighting scheme is able to recover the correct values of
the two parameters of $\xi(r)$, $r_0$ and $\gamma$. Figure 12 plots the
projected correlation function $w_p$, computed for one of
the VVDS mock cones, together with the measured
best-fit values of $r_0$ and $\gamma$. The error contours are estimated
from the variance of the 50 mock surveys as described
previously and their size depends mainly on the number
of galaxies within each bin. Figure 13 shows that the
evolution of clustering we "observe" in this specific simulated
VVDS cone agrees quite well with its parent sample.

Of course, due to cosmic variance, the values of $r_0$
and $\gamma$ differ between different simulated cones. Figure 14
shows the spread of these parameters among all the 50
mock VVDS surveys and their parent catalogs, for a
representative redshift bin ($z = [0.5 - 0.7]$). This behavior
is similarly seen in the other redshift bins, indicating an
increased spread in the parameter estimates in the
"observed" catalogs, an effect easily explained in terms of the
smaller number of objects. These figures also
indicate that at the end of our correction process any
possible systematic effect is reduced to less than 5%, a value
always significantly smaller than the uncertainty due to
cosmic variance, which is of the order of 15-20%.
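The comparison between a residual systematic offset and the cosmic scatter can be expressed as two simple ensemble statistics. The sketch below assumes two lists of best-fit values (for example $r_0$) from the "observed" mock catalogs and from their parent catalogs; the function name and the returned dictionary keys are hypothetical, and the numbers in the usage lines are toy values, not measurements from the paper.

```python
def ensemble_summary(observed, parent):
    """Fractional systematic offset and fractional rms scatter over an ensemble."""
    n = len(observed)
    mean_obs = sum(observed) / n
    mean_par = sum(parent) / len(parent)
    rms = (sum((x - mean_obs) ** 2 for x in observed) / n) ** 0.5
    return {
        "systematic": (mean_obs - mean_par) / mean_par,  # bias of the recovered mean
        "scatter": rms / mean_obs,                       # cone-to-cone variation
    }

# Toy usage with invented numbers, only to show the call.
summary = ensemble_summary([2.0, 3.0, 4.0], [3.0, 3.0, 3.0])
```

A correction scheme is performing well when the "systematic" entry is small compared with the "scatter" entry.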



6.5. Tests of VVDS observing strategy 

In this Section we want to discuss from a more general
perspective (i.e. not limited to the current status and lay-out
of the 02h field) how the accuracy of correlation
measurements can depend on the number of multiple spectroscopic
pointings ("passes") that are dedicated to a specific area.
In other words: do multiple passes increase, as expected,
the accuracy of correlation function measurements,
not only thanks to the increased statistics, but
also because of the improved sampling of the clustering
process? And how does our correcting scheme perform
when handling a very sparse (one pass) or a more densely
sampled area? This is clearly an interesting question for
the future development of the VVDS, or other surveys, as
these tests can indicate what strategy could be more
efficient. One would like to estimate the fraction of galaxies
necessary to recover the correlation signal to a certain level
of accuracy. This, translated to the VVDS, implies
determining how many spectroscopic "passes" with VIMOS are
necessary. Note that the answer is not trivial, since
multiple pointings over the same area are usually dithered
(i.e. shifted by an amount at least larger than the central
"cross", i.e. 2'), and thus a larger number of passes over
the same area, while improving the sampling, also introduces
a more complex mean density pattern, as explained
in section 2.1.

Fig. 12. Evolution of the projected function $w_p(r_p)$ (left
column) and the corresponding best-fit parameters of
$\xi(r)$, $r_0$ and $\gamma$ (right column), as seen in one of the
VVDS mock surveys. Error bars are computed as
explained in the text, while error contours on the fit
parameters are obtained taking into account the full
covariance matrix. The 68.3%, 90% and 95.4% joint
confidence levels are defined as in Numerical Recipes
(Press et al., 1992, chapter 15.6) in
terms of the corresponding likelihood intervals that we
obtain from our fitting procedure (see § 5.2).

Tests have been performed creating a grid of six
pointings, spaced with the same step as the real VVDS ones in
the VVDS-10h field. The second pass was then arranged
over a grid shifted by 2' in right ascension and declination.
The pointings of both passes have then been "observed"
once again with a different selection of objects for
spectroscopy. At the end (maximum coverage), this resulted in
an area of 0.3624 square degrees, mostly uniformly covered
but with small patches of sky that were observed either
three, two or one times or remained unobserved. The
results for $w_p(r_p)$ and $\xi(s)$ are shown in Figures 15 and 16
respectively.

The projected correlation function $w_p$ is fairly well
recovered almost independently of the sampling density. For
a single pass, power is not recovered properly at scales
below 0.6 $h^{-1}$ Mpc, since there is in practice no pair (even
biased) to be "corrected" in a proper statistical way by
our scheme.

Fig. 13. Evolution of $r_0$ in a VVDS mock survey (filled
circles), compared to that of its parent catalog (open
circles). Error bars are as explained in the text. The "true"
and "measured" values of $r_0$ are very consistent within
the error bars, providing an internal proof of the quality
of our correction scheme.

Fig. 14. Histograms of the measurements of $r_0$ and $\gamma$ in
the redshift bin [0.5 - 0.7] (chosen as a representative case),
among the 50 mock catalogs, for the full cones (left
column) and for the observed samples (right column), where
the full weighting scheme has been applied. The ensemble
averaged values of $r_0$ and $\gamma$ are indicated in each panel,
together with their rms error.

Fig. 15. Measured $w_p(r_p)$ in the case of different numbers
of passes over the same field. When the field is observed
only once we are clearly not able to properly recover
$w_p(r_p)$ on the smallest scales. When we observe the
field more times the recovery is much better also on the
small scales.

The case of $\xi(s)$ (Figure 16) shows even more clearly
the difficulty of recovering very small scale pairs with only
one pass: in this case, there is an intrinsic small-scale
limitation (complete lack of pairs), which cannot be fully
overcome by the correcting scheme. The figure shows, for
example, that while a linear bin between 0 and 1 $h^{-1}$ Mpc is
already sufficient to recover the correct clustering
amplitude even with one pass, smaller logarithmic bins below 1
$h^{-1}$ Mpc are inadequate and suffer from the lack of
measured pairs.

We conclude that even in the fields that were observed
only with one spectroscopic observation, sampling about
15% of the photometric targets down to $I_{AB} = 24$, the
two-point correlation function can be measured quite well
for separations $1 < r < 10\ h^{-1}$ Mpc. The results confirm,
however, that observing fields four times, sampling about
40% of the population as in the deep part of the VVDS,
provides the possibility of more precise measurements on
scales down to 0.1 $h^{-1}$ Mpc.

7. Summary and conclusions 

One of the key goals of the VVDS survey is to measure the
evolution of the galaxy clustering from the present epoch
up to $z \sim 2$ and larger. To study in detail the error
budget of $\xi(r)$ measurements in the VVDS survey, we have
generated a set of mock catalogs using the GalICS model
of semi-analytic galaxy formation. The geometry of the
VVDS survey on the sky is complex due to the observing
strategy. The resulting selection function substantially
affects the angular correlation properties of the clustering
of the observed galaxies. We demonstrate that the
correlation observed in redshift space is much less affected and
that the bias introduced by the observing strategy can be
largely removed using the correcting scheme we propose
in this paper.

Fig. 16. Measurements of $\xi(s)$ for a different number of
observing "visits" over the same field.

We conclude that, for the first epoch VVDS data, we
can expect to measure $\xi(s)$ and $w_p(r_p)$ to better than 10%
on scales $1 < r < 10\ h^{-1}$ Mpc, and better than 30%
below 1 $h^{-1}$ Mpc. Results obtained from the GalICS
simulations indicate that the two-point correlation functions
computed from the First Epoch VVDS should suffer only
from a modest cosmic variance of $\sim 15 - 20$%. These
results suggest that after the final selection of objects for
spectroscopy the variance becomes twice as large as the
variance of the underlying parent galaxy field in the same
area. We expect, in each redshift slice $\Delta z \sim 0.2$ in the
redshift range $z = [0.2, 2.1]$, to measure $r_0$ and $\gamma$ with an
accuracy better than 15 - 20%. We show that any
residual systematic effect in the measurements of $r_0$ and $\gamma$ is
below 5%, i.e. a value much smaller than the cosmic errors.

The actual measured clustering properties of galaxies
in the VVDS survey, using the framework outlined in
this paper, are presented in Le Fevre et al. (2005c) and in
forthcoming papers.